Session 4 of the Multiplexed Tissue Imaging Workshop focuses on single-cell analysis and pre-prcessing. The first section will demonstrate how to perform spillover correction before highlighting a batch correction approach and cell phenotyping.
Small levels of channel-to-channel spillover occurs specifically in IMC which can be corrected for assuming a linear relationship between the signal of one channel and the signal of the neighboring channel.
This section highlights how to generate a spillover matrix from individually acquired single metal spots on an agarose slide. Each spot needs to be imaged as its own acquisition/ROI and individual TXT files containing the pixel intensities per spot need to be available. For complete details on the spillover correction approach, please refer to the original publication
When downloading the data you retrieved an MCD file which contains individually measured metal spots.
In brief, the highlighted workflow comprises 5 steps:
In the first step, we will generate a spillover matrix based on the single-metal spots and save it for later use.
Here, we will read in the individual TXT files into a
SingleCellExperiment object. This object can be used
directly by the CATALYST package to estimate the
spillover.
For this to work, the .txt file names need to contain the spotted
metal isotope name. By default, the first occurrence of the isotope in
the format (mt)(mass) (e.g. Sm152 for Samarium
isotope with the atomic mass 152) will be used as spot identifier.
Alternatively, a named list of already read-in pixel intensities can be
provided. For more inforation, please refer to the man page
?readSCEfromTXT.
For further downstream analysis, we will asinh-transform the data using a cofactor of 5; a common transformation for CyTOF data.
library(imcRtools)
## Warning: package 'imcRtools' was built under R version 4.2.2
## Warning: package 'S4Vectors' was built under R version 4.2.2
## Warning: package 'GenomeInfoDb' was built under R version 4.2.2
# Create SingleCellExperiment from .txt files
sce <- readSCEfromTXT("data/compensation/")
## Spotted channels: Y89, In113, In115, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176
## Acquired channels: Ar80, Y89, In113, In115, Xe131, Xe134, Ba136, La138, Pr141, Nd142, Nd143, Nd144, Nd145, Nd146, Sm147, Nd148, Sm149, Nd150, Eu151, Sm152, Eu153, Sm154, Gd155, Gd156, Gd158, Tb159, Gd160, Dy161, Dy162, Dy163, Dy164, Ho165, Er166, Er167, Er168, Tm169, Er170, Yb171, Yb172, Yb173, Yb174, Lu175, Yb176, Ir191, Ir193, Pt196, Pb206
## Channels spotted but not acquired:
## Channels acquired but not spotted: Ar80, Xe131, Xe134, Ba136, La138, Ir191, Ir193, Pt196, Pb206
assay(sce, "exprs") <- asinh(counts(sce)/5)
In the next step, we will observe the median pixel intensities per spot and threshold on medians < 200 counts. These types of visualization serve two purposes:
Small median pixel intensities (< 200 counts) might hinder the robust estimation of the channel spillover. In that case, consecutive pixels can be summed (see Optional pixel binning).
Each spotted metal (row) should show the highest median pixel intensity in its corresponding channel (column). If this is not the case, either the naming of the .txt files was incorrect or the incorrect metal was spotted.
# Log10 median pixel counts per spot and channel
plotSpotHeatmap(sce)
# Thresholded on 200 pixel counts
plotSpotHeatmap(sce, log = FALSE, threshold = 200)
As we can see, all median pixel intensities are > 200 counts for each spot. We also observe acquired channels for which no spot was placed (Xe134, Ir191, Ir193).
In cases where median pixel intensities are low (< 200 counts),
consecutive pixels can be summed to increase the robustness of the
spillover estimation. The imcRtools package provides the
binAcrossPixels function, which performs aggregation for
each channel across bin_size consecutive pixels per spotted
metal.
# Define grouping
bin_size = 10
sce2 <- binAcrossPixels(sce, bin_size = bin_size)
# Log10 median pixel counts per spot and channel
plotSpotHeatmap(sce2)
# Thresholded on 200 pixel counts
plotSpotHeatmap(sce2, log = FALSE, threshold = 200)
Here, we can see an increase in the median pixel intensities and accumulation of off-diagonal signal. Due to already high original pixel intensities, we will refrain from aggregating across consecutive pixels for this demonstration.
The following step uses functions provided by the
CATALYST package to “debarcode” the pixels. Based on the
intensity distribution of all channels, pixels are assigned to their
corresponding barcode; here this is the already known metal spot. This
procedure serves the purpose to identify pixels that cannot be robustly
assigned to the spotted metal. Pixels of such kind can be regarded as
“noisy”, “background” or “artefacts” that should be removed prior to
spillover estimation.
We will also need to specify which channels were spotted (argument
bc_key). This information is directly contained in the
colData(sce) slot. To facilitate visualization, we will
order the bc_key by mass.
The general workflow for pixel debarcoding is as follows:
library(CATALYST)
bc_key <- as.numeric(unique(sce$sample_mass))
bc_key <- bc_key[order(bc_key)]
sce <- assignPrelim(sce, bc_key = bc_key)
sce <- estCutoffs(sce)
sce <- applyCutoffs(sce)
The obtained SingleCellExperiment now contains the
additional bc_id entry. For each pixel, this vector
indicates the assigned mass (e.g. 161) or 0,
meaning unassigned.
This information can be visualized in form of a heatmap:
library(pheatmap)
cur_table <- table(sce$bc_id, sce$sample_mass)
pheatmap(log10(cur_table + 1),
cluster_rows = FALSE,
cluster_cols = FALSE)
We can see here, that all pixels were assigned to the right mass and that all pixel sets are made up of > 1000 pixels.
However, in cases where incorrect assignment occurred or where few
pixels were measured for some spots, the imcRtools package
exports a simple helper function to exclude pixels based on these
criteria:
sce <- filterPixels(sce, minevents = 40, correct_pixels = TRUE)
Based on the single-positive pixels, we use the
CATALYST::computeSpillmat() function to compute the
spillover matrix and CATALYST::plotSpillmat() to visualize
it. The plotSpillmat function checks the spotted and
acquired metal isotopes against a pre-defined
CATALYST::isotope_list(). In this data, the
Ar80 channel was additionally acquired to check for
deviations in signal intensity. Ar80 needs to be added to a
custom isotope_list object for visualization.
sce <- computeSpillmat(sce)
isotope_list <- CATALYST::isotope_list
isotope_list$Ar <- 80
plotSpillmat(sce, isotope_list = isotope_list)
## Warning: The `guide` argument in `scale_*()` cannot be `FALSE`. This was deprecated in
## ggplot2 3.3.4.
## ℹ Please use "none" instead.
## ℹ The deprecated feature was likely used in the CATALYST package.
## Please report the issue at <]8;;https://github.com/HelenaLC/CATALYST/issueshttps://github.com/HelenaLC/CATALYST/issues]8;;>.
# Save spillover matrix in new object
sm <- metadata(sce)$spillover_matrix
Of note: the visualization of the spillover matrix using CATALYST does currently not visualize spillover between the larger channels. In this case, the spillover matrix is clipped at Yb171.
As we can see, the largest spillover appears in
In113 --> In115 and we also observe the +16
oxide impurities for e.g. Nd148 --> Dy164.
We can save the spillover matrix for external use.
write.csv(sm, "data/sm.csv")
The CATALYST package can be used to perform spillover
compensation on the single-cell mean intensities. Here,
the SpatialExperiment object generated in Section
@ref(read-data) is read in. The CATALYST package requires
an entry to rowData(spe)$channel_name for the
compCytof function to run. This entry should contain the
metal isotopes in the form (mt)(mass)Di (e.g., Sm152Di for
Samarium isotope with the atomic mass 152).
The compCytof function performs channel spillover
compensation on the mean pixel intensities per channel and cell. Here,
we will not overwrite the assays in the SpatialExperiment
object to later highlight the effect of compensation. As shown in
Section @ref(read-data), also the compensated counts are
asinh-transformed using a cofactor of 1.
spe <- readRDS("data/spe.rds")
rowData(spe)$channel_name <- paste0(rowData(spe)$channel, "Di")
spe <- compCytof(spe, sm,
transform = TRUE, cofactor = 1,
isotope_list = isotope_list,
overwrite = FALSE)
To check the effect of channel spillover compensation, the expression of markers that are affected by spillover (e.g., E cadherin in channel Yb173 and CD303 in channel Yb174) can be visualized in form of scatter plots before and after compensation.
library(dittoSeq)
library(patchwork)
before <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303",
assay.x = "exprs", assay.y = "exprs") +
ggtitle("Before compensation")
after <- dittoScatterPlot(spe, x.var = "Ecad", y.var = "CD303",
assay.x = "compexprs", assay.y = "compexprs") +
ggtitle("After compensation")
before + after
We observe that the spillover Yb173 –> Yb174 was successfully corrected. To facilitate further downstream analysis, the non-compensated assays can now be replaced by their compensated counterparts:
assay(spe, "counts") <- assay(spe, "compcounts")
assay(spe, "exprs") <- assay(spe, "compexprs")
assay(spe, "compcounts") <- assay(spe, "compexprs") <- NULL
The cytomapper
package allows channel spillover compensation directly on
multi-channel images. The compImage
function takes a CytoImageList object and the estimated
spillover matrix as input. More info on how to work with
CytoImageList objects can be seen in Section
@ref(image-visualization).
At this point, we can read in the CytoImageList object
containing multi-channel images as generated in Section @ref(read-data).
The channelNames need to be set according to their metal
isotope in the form (mt)(mass)Di and therefore match
colnames(sm).
library(cytomapper)
images <- readRDS("data/images.rds")
masks <- readRDS("data/masks.rds")
channelNames(images) <- rowData(spe)$channel_name
The CATALYST package provides the adaptSpillmat function
that corrects the spillover matrix in a way that rows and columns match
a predefined set of metals. Please refer to ?compCytof for
more information how metals in the spillover matrix are matched to
acquired channels in the SingleCellExperiment object.
The spillover matrix can now be adapted to exclude channels that are
not part of the measurement (keep == 0).
library(tiff)
panel <- read.csv("data/steinbock/panel.csv")
adapted_sm <- adaptSpillmat(sm, paste0(panel$channel[panel$keep == 1], "Di"),
isotope_list = isotope_list)
## Compensation is likely to be inaccurate.
## Spill values for the following interactions
## have not been estimated:
## Ir191Di -> Ir193Di
## Ir193Di -> Ir191Di
The adpated spillover matrix now matches the
channelNames of the CytoImageList object and
can be used to perform pixel-level spillover compensation.
library(BiocParallel)
## Warning: package 'BiocParallel' was built under R version 4.2.2
library(parallel)
images_comp <- compImage(images, adapted_sm,
BPPARAM = MulticoreParam())
As a sanity check, we will visualize the image before and after compensation:
# Before compensation
plotPixels(images[5], colour_by = "Yb173Di",
image_title = list(text = "Yb173 (Ecad) - before", position = "topleft"),
legend = NULL, bcg = list(Yb173Di = c(0, 4, 1)))
plotPixels(images[5], colour_by = "Yb174Di",
image_title = list(text = "Yb174 (CD303) - before", position = "topleft"),
legend = NULL, bcg = list(Yb174Di = c(0, 4, 1)))
# After compensation
plotPixels(images_comp[5], colour_by = "Yb173Di",
image_title = list(text = "Yb173 (Ecad) - after", position = "topleft"),
legend = NULL, bcg = list(Yb173Di = c(0, 4, 1)))
plotPixels(images_comp[5], colour_by = "Yb174Di",
image_title = list(text = "Yb174 (CD303) - after", position = "topleft"),
legend = NULL, bcg = list(Yb174Di = c(0, 4, 1)))
For convenience, we will re-set the channelNames to
their biological targtes:
channelNames(images_comp) <- rownames(spe)
In the previous Session we observed staining/expression differences between the individual samples. This can arise due to technical (e.g., differences in sample processing) as well as biological (e.g. differential expression between patients/indications) reasons. However, the combination of these effects may hinder cell phenotyping via clustering.
To integrate cells across samples, we can use computational
strategies developed for correcting batch effects in single-cell RNA
sequencing data. Computational tools in R that perform those corrections
include batchelor,
harmony and Seurat.
Due to time constrains, we only highlight the batchelor
package for batch correction.
Of note: the correction approaches presented here aim at removing any differences between samples. This will also remove biological differences between the patients/indications. Nevertheless, integrating cells across samples can facilitate the detection of cell phenotypes via clustering.
The batchelor package provides the
mnnCorrect and fastMNN functions to correct
for differences between samples/batches. Both functions build up on
finding mutual nearest neighbors (MNN) among the cells of different
samples and correct expression differences between the cells. The
mnnCorrect function returns corrected expression counts
while the fastMNN functions performs the correction in
reduced dimension space. As such, fastMNN returns
integrated cells in form of a low dimensional embedding.
Paper: Batch
effects in single-cell RNA-sequencing data are corrected by matching
mutual nearest neighbors
Documentation: batchelor
Here, we apply the fastMNN function to integrate cells
between patients. By setting auto.merge = TRUE the function
estimates the best batch merging order by maximizing the number of MNN
pairs at each merging step. This is more time consuming than merging
sequentially based on how batches appear in the dataset (default). We
again select the markers defined in Section @ref(cell-processing) for
sample correction.
The function returns a SingleCellExperiment object which
contains corrected low-dimensional coordinates for each cell in the
reducedDim(out, "corrected") slot. This low-dimensional
embedding can be further used for clustering and non-linear
dimensionality reduction. We transfer the corrected coordinates to the
main SpatialExperiment object.
library(batchelor)
set.seed(220228)
out <- fastMNN(spe, batch = spe$patient_id,
auto.merge = TRUE,
subset.row = rowData(spe)$use_channel,
assay.type = "exprs")
# Transfer the correction results to the main spe object
reducedDim(spe, "fastMNN") <- reducedDim(out, "corrected")
The fastMNN function further returns outputs that can be
used to assess the quality of the batch correction. The
metadata(out)$merge.info entry collects diagnostics for
each individual merging step. Here, the batch.size and
lost.var entries are important. The batch.size
entry reports the relative magnitude of the batch effect and the
lost.var entry represents the percentage of lost variance
per merging step. A large batch.size and low
lost.var indicate sufficient batch correction.
merge_info <- metadata(out)$merge.info
DataFrame(left = merge_info$left,
right = merge_info$right,
batch.size = merge_info$batch.size,
max_lost_var = rowMax(merge_info$lost.var))
## DataFrame with 3 rows and 4 columns
## left right batch.size max_lost_var
## <List> <List> <numeric> <numeric>
## 1 Patient4 Patient2 0.376540 0.0456565
## 2 Patient4,Patient2 Patient1 0.552928 0.0420837
## 3 Patient4,Patient2,Patient1 Patient3 0.753851 0.0744367
We observe that Patient4 and Patient2 are most similar with a low batch effect. Merging cells of Patient3 into the combined batch of Patient1, Patient2 and Patient4 resulted in the highest percentage of lost variance and the detection of the largest batch effect. In the next paragraph we can visualize the correction results.
The simplest option to check if the sample effects were corrected is by using non-linear dimensionality reduction techniques and observe mixing of cells across samples. We will recompute the UMAP embedding using the corrected low-dimensional coordinates for each cell.
library(scater)
set.seed(220228)
spe <- runUMAP(spe, dimred= "fastMNN", name = "UMAP_mnnCorrected")
Next, we visualize the corrected UMAP while overlaying patient IDs.
library(cowplot)
library(dittoSeq)
library(viridis)
# visualize patient id
p1 <- dittoDimPlot(spe, var = "patient_id",
reduction.use = "UMAP", size = 0.2) +
scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
ggtitle("Patient ID on UMAP before correction")
p2 <- dittoDimPlot(spe, var = "patient_id",
reduction.use = "UMAP_mnnCorrected", size = 0.2) +
scale_color_manual(values = metadata(spe)$color_vectors$patient_id) +
ggtitle("Patient ID on UMAP after correction")
plot_grid(p1, p2)
We observe an imperfect merging of Patient3 into all other samples. This was already seen when displaying the merging information above. We now also visualize the expression of selected markers across all cells before and after batch correction.
markers <- c("Ecad", "CD45RO", "CD20", "CD3", "FOXP3", "CD206", "MPO", "SMA", "Ki67")
# Before correction
plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP",
assay = "exprs", size = 0.2, list.out = TRUE)
plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
plot_grid(plotlist = plot_list)
# After correction
plot_list <- multi_dittoDimPlot(spe, var = markers, reduction.use = "UMAP_mnnCorrected",
assay = "exprs", size = 0.2, list.out = TRUE)
plot_list <- lapply(plot_list, function(x) x + scale_color_viridis())
plot_grid(plotlist = plot_list)
We observe that immune cells across patients are merged after batch
correction using fastMNN. However, the tumor cells of
different patients still cluster separately.
A common step during single-cell data analysis is the annotation of cells based on their phenotype. Defining cell phenotypes is often subjective and relies on previous biological knowledge. The Orchestrating Single Cell Analysis with Bioconductor book presents a number of approaches to phenotype cells detected by single-cell RNA sequencing based on reference datasets or prior knowledge of geneset importance.
In highly-multiplexed imaging, target proteins or molecules are manually selected based on the biological question at hand. It narrows down the feature space and facilitates the manual annotation of clusters to derive cell phenotypes. We will therefore highlight the use of one clustering approache to group cells based on their similarity in marker expression.
Unlike single-cell RNA sequencing or CyTOF data, single-cell data derived from highly-multiplexed imaging data often suffers from “lateral spillover” between neighboring cells. This spillover caused by imperfect segmentation often hinders accurate clustering to define specific cell phenotypes in multiplexed imaging data. In Section @ref(classification) we will train and apply a random forest classifier to classify cell phenotypes in the dataset as alternative approach for clustering-based cell phenotyping. This approach has been previously used to identify major cell phenotypes in metastatic melanoma and avoids clustering of cells.
We will first sample 2000 cells to visualize cluster membership.
# Sample cells
set.seed(220619)
cur_cells <- sample(seq_len(ncol(spe)), 2000)
In the first section to identify cellular phenotypes in the dataset,
we will present a popular clustering approache that groups cells based
on their similarity in marker expression. A number of approaches have
been developed to cluster data derived from single-cell RNA sequencing
technologies [@Yu2022] or CyTOF [@Weber2016]. For demonstration purposes, we
will only highlight the Phenograph clustering approach. For
a full overview on different clustering approaches, please refer to the
Phenotyping section of the online book.
The PhenoGraph clustering approach was first described to group cells
of a CyTOF dataset [@Levine2015]. The
algorithm first constructs a graph by detecting the k
nearest neighbours based on euclidean distance in expression space. In
the next step, edges between nodes (cells) are weighted by their overlap
in nearest neighbor sets. To quantify the overlap in shared nearest
neighbor sets, the jaccard index is used. The Louvain modularity
optimization approach is used to detect connected communities and
partition the graph into clusters of cells. This clustering strategy was
used by Jackson, Fischer et al. and Schulz et al. to
cluster IMC data [@Jackson2020;
@Schulz2018].
There are several different PhenoGraph implementations available in R. Here, we use the one available at https://github.com/i-cyto/Rphenograph. For large datasets, https://github.com/stuchly/Rphenoannoy offers a more performant implementation of the algorithm.
In the following code chunk, we select the asinh-transformed mean
pixel intensities per cell and channel and subset the channels to the
ones containing biological variation. This matrix is transposed to store
cells in rows. Within the Rphenograph function, we select
the 45 nearest neighbors for graph building and louvain community
detection (default). The function returns a list of length 2, the first
entry being the graph and the second entry containing the community
object. Calling membership on the community object will
return cluster IDs for each cell. These cluster IDs are then stored
within the colData of the SpatialExperiment
object. Cluster IDs are mapped on top of the UMAP embedding and
single-cell marker expression within each cluster are visualized in form
of a heatmap.
It is recommended to test different inputs to k as shown
in the next section. Selecting larger values for k results
in larger clusters.
library(Rphenograph)
library(igraph)
library(dittoSeq)
library(viridis)
mat <- t(assay(spe, "exprs")[rowData(spe)$use_channel,])
out <- Rphenograph(mat, k = 45)
clusters <- factor(membership(out[[2]]))
spe$pg_clusters <- clusters
dittoDimPlot(spe, var = "pg_clusters",
reduction.use = "UMAP", size = 0.2,
do.label = TRUE) +
ggtitle("Phenograph clusters expression on UMAP")
dittoHeatmap(spe[,cur_cells],
genes = rownames(spe)[rowData(spe)$use_channel],
assay = "exprs", scale = "none",
heatmap.colors = viridis(100),
annot.by = c("pg_clusters", "patient_id"),
annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters))],
metadata(spe)$color_vectors$patient_id))
We can observe that some of the clusters only contain cells of a
single patient. This can often be observed in the tumor compartment. In
the next step, we use the integrated cells (see Section
@ref(batch-effects)) in low dimensional embedding for clustering. Here,
the low dimensional embedding can be directly accessed from the
reducedDim slot.
mat <- reducedDim(spe, "fastMNN")
out <- Rphenograph(mat, k = 45)
clusters <- factor(membership(out[[2]]))
spe$pg_clusters_corrected <- clusters
dittoDimPlot(spe, var = "pg_clusters_corrected",
reduction.use = "UMAP_mnnCorrected", size = 0.2,
do.label = TRUE) +
ggtitle("Phenograph clusters expression on UMAP, integrated cells")
dittoHeatmap(spe[,cur_cells],
genes = rownames(spe)[rowData(spe)$use_channel],
assay = "exprs", scale = "none",
heatmap.colors = viridis(100),
annot.by = c("pg_clusters_corrected", "pg_clusters","patient_id"),
annot.colors = c(dittoColors(1)[1:length(unique(spe$pg_clusters_corrected))],
dittoColors(1)[1:length(unique(spe$pg_clusters))],
metadata(spe)$color_vectors$patient_id))
Clustering using the integrated embedding leads to clusters that contain cells of different patients. Cluster annotation can now be performed by manually labeling cells based on their marker expression.
In this section, we will highlight a cell type classification approach based on ground truth labeling and random forest classification. The rational for this supervised cell phenotyping approach is to use the information contained in the pre-defined markers to detect cells of interest. This approach was used by Hoch et al. (2022) to classify cell types in a metastatic melanoma IMC dataset.
The antibody panel used in the example data set mainly focuses on immune cell types and little on tumor cell phenotypes. Therefore we will label the following cell types:
The “B next to T cell” phenotype (BnTcell) is commonly
observed in immune infiltrated regions measured by IMC. We include this
phenotype to account for B cell/T cell interactions where precise
classification into B cells or T cells is not possible.
As related approaches, Astir and Garnett use pre-defined panel information to classify cell phenotypes based on their marker expression.
The cytomapper
package provides the cytomapperShiny function that allows
gating of cells based on their marker expression and visualization of
selected cells directly on the images.
library(cytomapper)
if (interactive()) {
cytomapperShiny(object = spe, mask = masks, image = images_comp,
cell_id = "ObjectNumber", img_id = "sample_id")
}
The labeled cells for this data set can be accessed at zenodo.org/record/7432486
and were downloaded previously. Per image, the
cytomapperShiny function allows the export of gated cells
in form of a SingleCellExperiment or
SpatialExperiment object. The cell label is stored in
colData(object)$cytomapper_CellLabel and the gates are
stored in metadata(object). In the next section, we will
read in and consolidate the labeled data.
For consistent visualization of cell types, we will now pre-define their colors:
celltype <- setNames(c("#3F1B03", "#F4AD31", "#894F36", "#1C750C", "#EF8ECC",
"#6471E2", "#4DB23B", "grey", "#F4800C", "#BF0A3D", "#066970"),
c("Tumor", "Stroma", "Myeloid", "CD8", "Plasma_cell",
"Treg", "CD4", "undefined", "BnTcell", "Bcell", "Neutrophil"))
metadata(spe)$color_vectors$celltype <- celltype
Here, we will read in the individual SpatialExperiment
objects containing the labeled cells and concatenate them. In the
process of concatenating the SpatialExperiment objects
along their columns, the sample_id entry is appended by
.1, .2, .3, ... due to replicated entries.
library(SingleCellExperiment)
label_files <- list.files("data/gated_cells",
full.names = TRUE, pattern = ".rds$")
# Read in SPE objects
spes <- lapply(label_files, readRDS)
# Merge SPE objects
concat_spe <- do.call("cbind", spes)
In the following code chunk we will identify cells that were labeled multiple times. This occurs when different cell phenotypes are gated per image and can affect immune cells that are located inside the tumor compartment.
We will first identify those cells that were uniquely labeled. In the
next step, we will identify those cells that were labeled twice AND were
labeled as Tumor cells. These cells will be assigned their immune cell
label. Finally, we will save the unique labels within the original
SpatialExperiment object.
cur_tab <- unclass(table(colnames(concat_spe),
concat_spe$cytomapper_CellLabel))
cur_labels <- rep("doublets", nrow(cur_tab))
names(cur_labels) <- rownames(cur_tab)
# Single assignments
single_index <- rowSums(cur_tab) == 1
cur_labels[single_index] <- colnames(cur_tab)[apply(cur_tab[single_index,], 1,
which.max)]
# Double assignment within the tumor
double_index <- rowSums(cur_tab) == 2 & cur_tab[,"Tumor"] == 1
no_tumor <- cur_tab[,colnames(cur_tab) != "Tumor"]
cur_labels[double_index] <- colnames(no_tumor)[apply(no_tumor[double_index,], 1,
which.max)]
# Remove doublets
cur_labels <- cur_labels[cur_labels != "doublets"]
table(cur_labels)
## cur_labels
## Bcell BnTcell CD4 CD8 Myeloid Neutrophil
## 780 1702 688 822 1750 134
## Plasma_cell Stroma Treg Tumor
## 716 442 361 5999
# Transfer labels to SPE object
spe_labels <- rep("unlabeled", ncol(spe))
names(spe_labels) <- colnames(spe)
spe_labels[names(cur_labels)] <- cur_labels
spe$cell_labels <- spe_labels
# Number of cells labeled per patient
table(spe$cell_labels, spe$patient_id)
##
## Patient1 Patient2 Patient3 Patient4
## Bcell 152 131 234 263
## BnTcell 396 37 240 1029
## CD4 45 342 167 134
## CD8 60 497 137 128
## Myeloid 183 378 672 517
## Neutrophil 97 4 17 16
## Plasma_cell 34 536 87 59
## Stroma 84 37 85 236
## Treg 139 149 49 24
## Tumor 2342 906 1618 1133
## unlabeled 7214 9780 7826 9580
Based on these labels, we can now train a random forest classifier to classify all remaining, unlabeled cells.
In this section, we will use the caret framework for machine learning in R. This package provides an interface to train a number of regression and classification models in a coherent fashion. We use a random forest classifier due to low number of parameters, high speed and an observed high performance for cell type classification [@Hoch2022].
In the following section, we will first split the
SpatialExperiment object into labeled and unlabeled cells.
Based on the labeled cells, we split the data into a train (75% of the
data) and test (25% of the data) dataset. We currently do not provide an
independently labeled validation dataset.
The caret package provides the trainControl
function, which specifies model training parameters and the
train function, which performs the actual model training.
While training the model, we also want to estimate the best model
parameters. In the case of the chosen random forest model
(method = "rf"), we only need to estimate a single
parameters (mtry) which corresponds to the number of
variables randomly sampled as candidates at each split. To estimate the
best parameter, we will perform a 5-fold cross validation (set within
trainControl) over a tune length of 5 entries to
mtry.
library(caret)
# Split between labeled and unlabeled cells
lab_spe <- spe[,spe$cell_labels != "unlabeled"]
unlab_spe <- spe[,spe$cell_labels == "unlabeled"]
# Randomly split into train and test data
set.seed(221029)
trainIndex <- createDataPartition(factor(lab_spe$cell_labels), p = 0.75)
train_spe <- lab_spe[,trainIndex$Resample1]
test_spe <- lab_spe[,-trainIndex$Resample1]
# Specify train parameters for 5-fold cross validation
fitControl <- trainControl(method = "cv",
number = 5)
# Select the data for training
cur_mat <- t(assay(train_spe, "exprs")[rowData(train_spe)$use_channel,])
# Train a random forest model for predicting cell labels
# This call also performs parameter tuning
rffit <- train(x = cur_mat,
y = factor(train_spe$cell_labels),
method = "rf", ntree = 1000,
tuneLength = 5,
trControl = fitControl)
rffit
## Random Forest
##
## 10049 samples
## 37 predictor
## 10 classes: 'Bcell', 'BnTcell', 'CD4', 'CD8', 'Myeloid', 'Neutrophil', 'Plasma_cell', 'Stroma', 'Treg', 'Tumor'
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 8040, 8039, 8038, 8038, 8041
## Resampling results across tuning parameters:
##
## mtry Accuracy Kappa
## 2 0.9674570 0.9565125
## 10 0.9766143 0.9688787
## 19 0.9774111 0.9699432
## 28 0.9781076 0.9708717
## 37 0.9766153 0.9688730
##
## Accuracy was used to select the optimal model using the largest value.
## The final value used for the model was mtry = 28.
We next observe the accuracy of the classifer when predicting cell phenotypes across the cross-validation as well as the test dataset.
First, we can visualize the classification accuracy during parameter tuning:
ggplot(rffit) +
geom_errorbar(data = rffit$results,
aes(ymin = Accuracy - AccuracySD,
ymax = Accuracy + AccuracySD),
width = 0.4) +
theme_classic(base_size = 15)
The best value for mtry is 28 and is used when
predicting new data.
It is often recommended to visualize the variable importance of the classifier. The following plot specifies which variables (markers) are most important for classifying the data.
plot(varImp(rffit))
As expected, the markers that were used for gating (Ecad, CD3, CD20, HLADR, CD8a, CD38, FOXP3) were important for classification.
To assess the accuracy, sensitivity, specificity, among other quality measures of the classifier, we will now predict cell phenotypes in the test data.
# Select test data
cur_mat <- t(assay(test_spe, "exprs")[rowData(test_spe)$use_channel,])
# Predict cell phenotypes in test data
cur_pred <- predict(rffit,
newdata = cur_mat)
While the overall classification accuracy can appear high, we also
want to check if each cell phenotype class is correctly predicted. For
this, we will calculate the confusion matrix between predicted and
actual cell labels. This measure highlights individual cell phenotype
classes that were not correctly predicted by the classifier. When
setting mode = "everything", the
confusionMatrix function returns all available prediction
measures including sensitivity, specificity, precision, recall and the
F1 score per cell phenotype class.
cm <- confusionMatrix(data = cur_pred,
reference = factor(test_spe$cell_labels),
mode = "everything")
cm
## Confusion Matrix and Statistics
##
## Reference
## Prediction Bcell BnTcell CD4 CD8 Myeloid Neutrophil Plasma_cell Stroma
## Bcell 187 2 0 0 0 0 7 0
## BnTcell 3 423 1 0 0 0 0 0
## CD4 0 0 161 0 0 2 3 2
## CD8 0 0 0 198 0 0 8 0
## Myeloid 0 0 2 2 437 0 0 0
## Neutrophil 0 0 0 0 0 30 0 0
## Plasma_cell 1 0 4 3 0 0 157 0
## Stroma 0 0 2 0 0 0 0 108
## Treg 0 0 0 0 0 0 3 0
## Tumor 4 0 2 2 0 1 1 0
## Reference
## Prediction Treg Tumor
## Bcell 0 0
## BnTcell 0 0
## CD4 0 4
## CD8 0 5
## Myeloid 0 0
## Neutrophil 0 0
## Plasma_cell 0 0
## Stroma 0 0
## Treg 90 2
## Tumor 0 1488
##
## Overall Statistics
##
## Accuracy : 0.9803
## 95% CI : (0.975, 0.9847)
## No Information Rate : 0.4481
## P-Value [Acc > NIR] : < 2.2e-16
##
## Kappa : 0.9737
##
## Mcnemar's Test P-Value : NA
##
## Statistics by Class:
##
## Class: Bcell Class: BnTcell Class: CD4 Class: CD8
## Sensitivity 0.95897 0.9953 0.93605 0.96585
## Specificity 0.99714 0.9986 0.99653 0.99586
## Pos Pred Value 0.95408 0.9906 0.93605 0.93839
## Neg Pred Value 0.99746 0.9993 0.99653 0.99777
## Precision 0.95408 0.9906 0.93605 0.93839
## Recall 0.95897 0.9953 0.93605 0.96585
## F1 0.95652 0.9930 0.93605 0.95192
## Prevalence 0.05830 0.1271 0.05142 0.06129
## Detection Rate 0.05590 0.1265 0.04813 0.05919
## Detection Prevalence 0.05859 0.1277 0.05142 0.06308
## Balanced Accuracy 0.97806 0.9970 0.96629 0.98086
## Class: Myeloid Class: Neutrophil Class: Plasma_cell
## Sensitivity 1.0000 0.909091 0.87709
## Specificity 0.9986 1.000000 0.99747
## Pos Pred Value 0.9909 1.000000 0.95152
## Neg Pred Value 1.0000 0.999095 0.99308
## Precision 0.9909 1.000000 0.95152
## Recall 1.0000 0.909091 0.87709
## F1 0.9954 0.952381 0.91279
## Prevalence 0.1306 0.009865 0.05351
## Detection Rate 0.1306 0.008969 0.04694
## Detection Prevalence 0.1318 0.008969 0.04933
## Balanced Accuracy 0.9993 0.954545 0.93728
## Class: Stroma Class: Treg Class: Tumor
## Sensitivity 0.98182 1.00000 0.9927
## Specificity 0.99938 0.99846 0.9946
## Pos Pred Value 0.98182 0.94737 0.9933
## Neg Pred Value 0.99938 1.00000 0.9940
## Precision 0.98182 0.94737 0.9933
## Recall 0.98182 1.00000 0.9927
## F1 0.98182 0.97297 0.9930
## Prevalence 0.03288 0.02691 0.4481
## Detection Rate 0.03229 0.02691 0.4448
## Detection Prevalence 0.03288 0.02840 0.4478
## Balanced Accuracy 0.99060 0.99923 0.9936
To easily visualize these results, we can now plot the true positive rate (sensitivity) versus the false positive rate (1 - specificity):
library(tidyverse)
data.frame(cm$byClass) %>%
mutate(class = sub("Class: ", "", rownames(cm$byClass))) %>%
ggplot() +
geom_point(aes(1 - Specificity, Sensitivity,
size = Detection.Rate,
fill = class),
shape = 21) +
scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +
theme_classic(base_size = 15) +
ylab("Sensitivity (TPR)") +
xlab("1 - Specificity (FPR)")
We observe high sensitivity and specificity for most cell types. Plasma cells show the lowest true positive rate with 88% being sufficiently high. The size of the circle specifies the number of cells per class.
Finally, to observe which cell phenotypes were wrongly classified, we can visualize the distribution of classification probabilities per cell phenotype class:
cur_pred <- predict(rffit,
newdata = cur_mat,
type = "prob")
cur_pred$truth <- factor(test_spe$cell_labels)
cur_pred %>%
pivot_longer(cols = Bcell:Tumor) %>%
ggplot() +
geom_boxplot(aes(x = name, y = value, fill = name), outlier.size = 0.5) +
facet_wrap(. ~ truth, ncol = 1) +
scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +
theme(panel.background = element_blank(),
axis.text.x = element_text(angle = 45, hjust = 1))
The boxplots indicate the classification probabilities per class. The classifier is well trained if classification probabilities are only high for the one specific class.
In the final section, we will now use the tuned and tested random forest classifier to predict the cell phenotypes of the unlabeled data.
First, we predict the cell phenotypes and extract their classification probabilities.
# Select unlabeled data
cur_mat <- t(assay(unlab_spe, "exprs")[rowData(unlab_spe)$use_channel,])
# Predict cell phenotypes
cell_class <- as.character(predict.train(rffit,
newdata = cur_mat,
type = "raw"))
names(cell_class) <- rownames(cur_mat)
table(cell_class)
## cell_class
## Bcell BnTcell CD4 CD8 Myeloid Neutrophil
## 808 985 3604 2674 5641 558
## Plasma_cell Stroma Treg Tumor
## 3381 4718 1162 10869
# Extract classification probabilities
cell_prob <- predict.train(rffit,
newdata = cur_mat,
type = "prob")
Each cell is assigned to the class with highest probability. There
are however cases, where the highest probability is low meaning the cell
can not be uniquely assigned to a class. We next want to identify these
cells and label them as “undefined”. Here, we select a maximum
classification probability threshold of 40% but this threshold needs to
be adjusted for other datasets. The adjusted cell labels are then stored
in the SpatialExperiment object.
library(ggridges)
# Distribution of maximum probabilities
tibble(max_prob = rowMax(as.matrix(cell_prob)),
type = cell_class) %>%
ggplot() +
geom_density_ridges(aes(x = max_prob, y = cell_class, fill = cell_class)) +
scale_fill_manual(values = metadata(spe)$color_vectors$celltype) +
theme_classic(base_size = 15) +
xlab("Maximum probability") +
ylab("Cell type") +
xlim(c(0,1.2))
## Picking joint bandwidth of 0.0206
# Label undefined cells
cell_class[rowMax(as.matrix(cell_prob)) < 0.4] <- "undefined"
# Store labels in SpatialExperiment onject
cell_labels <- spe$cell_labels
cell_labels[colnames(unlab_spe)] <- cell_class
spe$celltype <- cell_labels
table(spe$celltype, spe$patient_id)
##
## Patient1 Patient2 Patient3 Patient4
## Bcell 175 531 422 459
## BnTcell 416 585 591 1086
## CD4 421 1485 726 1557
## CD8 506 1341 485 1149
## Myeloid 1215 1910 1526 2657
## Neutrophil 352 9 136 181
## Plasma_cell 816 2483 421 341
## Stroma 614 611 697 3223
## Treg 550 416 256 293
## Tumor 5592 3346 5817 2102
## undefined 89 80 55 71
After classifying cells based on their phenotype in the dataset we will need to perform quality control to ensure correct labelling.
First, we will visualize the cell labels on the UMAP embedding:
dittoDimPlot(spe, var = "celltype",
reduction.use = "UMAP_mnnCorrected", size = 0.2,
do.label = TRUE) +
scale_color_manual(values = metadata(spe)$color_vectors$celltype) +
theme(legend.title = element_blank()) +
ggtitle("Cell types on UMAP, integrated cells")
We can also visualize each cells expression in form of a heatmap while labelling the columns by cell phenotype.
#Heatmap visualization - DittoHeatmap
dittoHeatmap(spe[,cur_cells],
genes = rownames(spe)[rowData(spe)$use_channel],
assay = "exprs", order.by = c("celltype"),
cluster_cols = FALSE, scale = "none",
heatmap.colors = viridis(100), annot.by = c("celltype","indication","patient_id"),
annotation_colors = list(indication = metadata(spe)$color_vectors$indication,
patient_id = metadata(spe)$color_vectors$patient_id,
celltype = metadata(spe)$color_vectors$celltype))
We can now also visualize the differences in cell phenotypes between patients in form of barplots.
# by patient_id - percentage
dittoBarPlot(spe, var = "celltype", group.by = "patient_id") +
scale_fill_manual(values = metadata(spe)$color_vectors$celltype)
## Scale for fill is already present.
## Adding another scale for fill, which will replace the existing scale.
The cytomapper package further provides functionality to
visualize the cell phenotypes on segmentation masks. For visualization
purposes, we select again three images.
# Sample images
set.seed(220517)
cur_id <- sample(unique(spe$sample_id), 3)
cur_images <- images_comp[names(images_comp) %in% cur_id]
cur_masks <- masks[names(masks) %in% cur_id]
plotCells(cur_masks,
object = spe,
cell_id = "ObjectNumber", img_id = "sample_id",
colour_by = "celltype")
A recommended way of visualizing segmentation quality is to select a cell phenotype of interest and outline the cells on composite images. In the following example, we select CD8+ T cells and visualize them on composite images displaying the marker intensity of CD3 and CD8.
CD8 <- spe[,spe$celltype == "CD8"]
plotPixels(image = cur_images,
mask = cur_masks,
object = CD8,
cell_id = "ObjectNumber", img_id = "sample_id",
colour_by = c("CD3", "CD8a"),
outline_by = "celltype",
bcg = list(CD3 = c(0, 5, 1),
CD8a = c(0, 5, 1)),
colour = list(celltype = c("CD8" = "white")),
thick = TRUE)
For easier visualization the plot can also be written out:
if (!dir.exists("data/CellValidation")) dir.create("data/CellValidation")
plotPixels(image = cur_images,
mask = cur_masks,
object = CD8,
cell_id = "ObjectNumber", img_id = "sample_id",
colour_by = c("CD3", "CD8a"),
outline_by = "celltype",
bcg = list(CD3 = c(0, 5, 1),
CD8a = c(0, 5, 1)),
colour = list(celltype = c("CD8" = "white")),
thick = TRUE,
save_plot = list(filename = "data/CellValidation/CD8.png"))
Finally, the generated data objects can be saved for further downstream processing and analysis.
saveRDS(spe, "data/spe.rds")
sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] ggridges_0.5.4 forcats_0.5.2
## [3] stringr_1.5.0 dplyr_1.0.10
## [5] purrr_0.3.5 readr_2.1.3
## [7] tidyr_1.2.1 tibble_3.1.8
## [9] tidyverse_1.3.2 caret_6.0-93
## [11] lattice_0.20-45 igraph_1.3.5
## [13] Rphenograph_0.99.1.9003 viridis_0.6.2
## [15] viridisLite_0.4.1 cowplot_1.1.1
## [17] scater_1.26.1 scuttle_1.8.2
## [19] batchelor_1.14.0 BiocParallel_1.32.4
## [21] tiff_0.1-11 cytomapper_1.10.0
## [23] EBImage_4.40.0 patchwork_1.1.2
## [25] dittoSeq_1.10.0 ggplot2_3.4.0
## [27] pheatmap_1.0.12 CATALYST_1.22.0
## [29] imcRtools_1.4.2 SpatialExperiment_1.8.0
## [31] SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0
## [33] Biobase_2.58.0 GenomicRanges_1.50.1
## [35] GenomeInfoDb_1.34.4 IRanges_2.32.0
## [37] S4Vectors_0.36.1 BiocGenerics_0.44.0
## [39] MatrixGenerics_1.10.0 matrixStats_0.63.0
##
## loaded via a namespace (and not attached):
## [1] ModelMetrics_1.2.2.2 R.methodsS3_1.8.2
## [3] bit64_4.0.5 knitr_1.41
## [5] irlba_2.3.5.1 multcomp_1.4-20
## [7] DelayedArray_0.24.0 R.utils_2.12.2
## [9] rpart_4.1.19 data.table_1.14.6
## [11] hardhat_1.2.0 RCurl_1.98-1.9
## [13] doParallel_1.0.17 generics_0.1.3
## [15] flowCore_2.10.0 ScaledMatrix_1.6.0
## [17] terra_1.6-47 TH.data_1.1-1
## [19] RANN_2.6.1 future_1.29.0
## [21] proxy_0.4-27 bit_4.0.5
## [23] tzdb_0.3.0 xml2_1.3.3
## [25] lubridate_1.9.0 httpuv_1.6.7
## [27] assertthat_0.2.1 gargle_1.2.1
## [29] gower_1.0.0 xfun_0.35
## [31] hms_1.1.2 jquerylib_0.1.4
## [33] evaluate_0.19 promises_1.2.0.1
## [35] fansi_1.0.3 readxl_1.4.1
## [37] dbplyr_2.2.1 DBI_1.1.3
## [39] htmlwidgets_1.5.4 googledrive_2.0.0
## [41] ellipsis_0.3.2 ggnewscale_0.4.8
## [43] ggpubr_0.5.0 backports_1.4.1
## [45] cytolib_2.10.0 svgPanZoom_0.3.4
## [47] sparseMatrixStats_1.10.0 vctrs_0.5.1
## [49] abind_1.4-5 cachem_1.0.6
## [51] withr_2.5.0 ggforce_0.4.1
## [53] vroom_1.6.0 svglite_2.1.0
## [55] cluster_2.1.4 crayon_1.5.2
## [57] drc_3.0-1 recipes_1.0.3
## [59] edgeR_3.40.0 pkgconfig_2.0.3
## [61] labeling_0.4.2 units_0.8-1
## [63] tweenr_2.0.2 nlme_3.1-160
## [65] vipor_0.4.5 nnet_7.3-18
## [67] globals_0.16.2 rlang_1.0.6
## [69] lifecycle_1.0.3 sandwich_3.0-2
## [71] modelr_0.1.10 rsvd_1.0.5
## [73] cellranger_1.1.0 randomForest_4.7-1.1
## [75] polyclip_1.10-4 Matrix_1.5-3
## [77] raster_3.6-11 carData_3.0-5
## [79] Rhdf5lib_1.20.0 zoo_1.8-11
## [81] reprex_2.0.2 beeswarm_0.4.0
## [83] RTriangle_1.6-0.11 googlesheets4_1.0.1
## [85] GlobalOptions_0.1.2 png_0.1-8
## [87] rjson_0.2.21 bitops_1.0-7
## [89] shinydashboard_0.7.2 R.oo_1.25.0
## [91] pROC_1.18.0 ConsensusClusterPlus_1.62.0
## [93] KernSmooth_2.23-20 rhdf5filters_1.10.0
## [95] DelayedMatrixStats_1.20.0 shape_1.4.6
## [97] classInt_0.4-8 parallelly_1.33.0
## [99] jpeg_0.1-10 rstatix_0.7.1
## [101] ggsignif_0.6.4 beachmat_2.14.0
## [103] scales_1.2.1 magrittr_2.0.3
## [105] plyr_1.8.8 zlibbioc_1.44.0
## [107] compiler_4.2.1 dqrng_0.3.0
## [109] concaveman_1.1.0 RColorBrewer_1.1-3
## [111] plotrix_3.8-2 clue_0.3-63
## [113] cli_3.4.1 XVector_0.38.0
## [115] listenv_0.8.0 FlowSOM_2.6.0
## [117] MASS_7.3-58.1 tidyselect_1.2.0
## [119] stringi_1.7.8 RProtoBufLib_2.10.0
## [121] highr_0.9 yaml_2.3.6
## [123] BiocSingular_1.14.0 locfit_1.5-9.6
## [125] ggrepel_0.9.2 grid_4.2.1
## [127] sass_0.4.4 timechange_0.1.1
## [129] tools_4.2.1 future.apply_1.10.0
## [131] circlize_0.4.15 rstudioapi_0.14
## [133] foreach_1.5.2 gridExtra_2.3
## [135] prodlim_2019.11.13 farver_2.1.1
## [137] Rtsne_0.16 ggraph_2.1.0
## [139] DropletUtils_1.18.1 digest_0.6.31
## [141] lava_1.7.0 shiny_1.7.3
## [143] Rcpp_1.0.9 car_3.1-1
## [145] broom_1.0.1 later_1.3.0
## [147] RcppAnnoy_0.0.20 httr_1.4.4
## [149] sf_1.0-9 ComplexHeatmap_2.14.0
## [151] distances_0.1.8 colorspace_2.0-3
## [153] rvest_1.0.3 fs_1.5.2
## [155] XML_3.99-0.13 splines_4.2.1
## [157] uwot_0.1.14 graphlayouts_0.8.4
## [159] sp_1.5-1 systemfonts_1.0.4
## [161] xtable_1.8-4 jsonlite_1.8.4
## [163] tidygraph_1.2.2 timeDate_4021.107
## [165] ipred_0.9-13 R6_2.5.1
## [167] pillar_1.8.1 htmltools_0.5.4
## [169] mime_0.12 nnls_1.4
## [171] glue_1.6.2 fastmap_1.1.0
## [173] DT_0.26 BiocNeighbors_1.16.0
## [175] fftwtools_0.9-11 class_7.3-20
## [177] codetools_0.2-18 mvtnorm_1.1-3
## [179] utf8_1.2.2 bslib_0.4.1
## [181] ResidualMatrix_1.8.0 ggbeeswarm_0.6.0
## [183] colorRamps_2.3.1 gtools_3.9.4
## [185] magick_2.7.3 survival_3.4-0
## [187] limma_3.54.0 rmarkdown_2.18
## [189] munsell_0.5.0 e1071_1.7-12
## [191] GetoptLong_1.0.5 rhdf5_2.42.0
## [193] GenomeInfoDbData_1.2.9 iterators_1.0.14
## [195] HDF5Array_1.26.0 haven_2.5.1
## [197] reshape2_1.4.4 gtable_0.3.1